perf: Optimize ipc stream read performance #24671

Liyixin95 · 2025-09-30T06:47:52Z

Benchmark: reading a 1.9 GB IPC stream file.

before:

________________________________________________________
Executed in    3.05 secs    fish           external
   usr time    2.01 secs  276.00 micros    2.01 secs
   sys time    1.04 secs  443.00 micros    1.04 secs

peak memory: 3980.86 MB

after:

________________________________________________________
Executed in    2.22 secs    fish           external
   usr time    1.48 secs  265.00 micros    1.48 secs
   sys time    0.74 secs  439.00 micros    0.74 secs

peak memory: 2086.41 MB

codecov · 2025-09-30T07:29:13Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.90%. Comparing base (b1623ff) to head (a68be9a).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #24671      +/-   ##
==========================================
- Coverage   81.92%   81.90%   -0.03%     
==========================================
  Files        1707     1707              
  Lines      235483   235453      -30     
  Branches     3000     3000              
==========================================
- Hits       192915   192836      -79     
- Misses      41801    41850      +49     
  Partials      767      767

orlp · 2025-10-03T11:07:44Z

crates/polars-arrow/src/io/ipc/read/stream.rs

                scratch,
            );

+            let new_pos = reader.stream_position()?;


Can you make the reborrow of reader explicit here? That is, pass &mut (&mut *reader).take(block_length as u64) instead? Right now this only works because reader.take automatically gets transformed to (&mut *reader).take. I was really confused how this compiled since take takes self and should consume the &mut R.

I have changed this to &mut (&mut *reader).take(block_length as u64), but I still think it's a little verbose...
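To illustrate the reborrow being discussed, here is a minimal sketch rather than the PR's actual code. The function `read_two_blocks` and its parameters are hypothetical names. It shows the explicit `(&mut *reader)` form requested in the review, plus `Read::by_ref`, the standard-library helper that expresses the same reborrow.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads two consecutive length-limited blocks
/// from the same reader.
fn read_two_blocks<R: Read>(
    reader: &mut R,
    first_len: u64,
    second_len: u64,
) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut first = Vec::new();
    // Explicit reborrow: `take` consumes the temporary `&mut R`,
    // so `reader` itself stays usable afterwards.
    (&mut *reader).take(first_len).read_to_end(&mut first)?;

    let mut second = Vec::new();
    // `by_ref()` returns `&mut R`, which spells the same reborrow.
    reader.by_ref().take(second_len).read_to_end(&mut second)?;

    Ok((first, second))
}
```

Both calls wrap a short-lived `&mut R` in `Take`, so the second read continues where the first one stopped.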

orlp · 2025-10-03T11:08:02Z

crates/polars-arrow/src/io/ipc/read/common.rs

        .unwrap_or_else(VecDeque::new);
    let mut buffers: VecDeque<arrow_format::ipc::BufferRef> = buffers.iter().collect();

-    // check that the sum of the sizes of all buffers is <= than the size of the file


Can you explain why this check was removed?

I can't easily get file_size after removing the intermediate vec. But removing this check will indeed produce wrong results, rather than reporting an error, when file_size and buffer_size do not match. Maybe we can change all the subsequent read_to_end calls to read_exact?
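For context, the kind of check under discussion can be sketched as follows. This is an assumption-laden illustration, not the code that was removed: `check_total_length` is a hypothetical name, and the buffer lengths are passed as plain `u64` values instead of the real IPC buffer type.

```rust
/// Hypothetical sketch: verify that the buffers together do not claim
/// more bytes than are actually available in the file or block.
fn check_total_length(lengths: &[u64], available: u64) -> Result<(), String> {
    let mut total: u64 = 0;
    for &len in lengths {
        // Guard against overflow from corrupt or malicious metadata.
        total = total
            .checked_add(len)
            .ok_or_else(|| "buffer lengths overflow u64".to_string())?;
    }
    if total > available {
        return Err(format!(
            "buffers claim {total} bytes but only {available} are available"
        ));
    }
    Ok(())
}
```

A check like this needs the total size up front, which is the value that became hard to obtain once the intermediate vec was removed.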

@orlp a gentle ping in case you forgot about this thread.

@Liyixin95 I think that should be the way forward then yes. I do want the length to be checked.

@orlp All subsequent read_to_end calls have now been changed to read_exact. read_exact requires the vec to be initialized first, which adds overhead, but I could try to eliminate this using read_buf_exact in another PR, if an unstable API is acceptable.
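The overhead mentioned in the comment above comes from `read_exact` writing into an already-initialized slice. A minimal sketch, where `read_block_exact` is a hypothetical helper and not the PR's code:

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads exactly `len` bytes or fails.
fn read_block_exact<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    // `read_exact` needs a `&mut [u8]` of the full length, so the Vec
    // must be initialized (here zero-filled) before any data is read.
    let mut buf = vec![0u8; len];
    // Fails with `ErrorKind::UnexpectedEof` if fewer than `len` bytes remain.
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```

The length check comes for free with this variant, at the cost of initializing the buffer first.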

@Liyixin95 Oh... I hadn't considered that it would add overhead. Could you change back to reserve + take + read_to_end and do a manual assert afterwards that checks the length?
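The approach suggested in the comment above (reserve, then take, then read_to_end, then a manual length check) might look roughly like this. It is a sketch under assumptions: `read_block_checked` is a hypothetical name, and it returns an error on a short read, whereas the merged commit is titled "use assert to check length" and may handle the mismatch differently.

```rust
use std::io::{self, Read};

/// Hypothetical helper: fills `buf` with exactly `len` bytes from `reader`.
fn read_block_checked<R: Read>(
    reader: &mut R,
    len: usize,
    buf: &mut Vec<u8>,
) -> io::Result<()> {
    buf.clear();
    // Allocate once up front without initializing the memory.
    buf.reserve(len);
    // Limit the read to `len` bytes; the explicit reborrow keeps `reader` usable.
    let read = (&mut *reader).take(len as u64).read_to_end(buf)?;
    // Manual length check: a short read means the stream is truncated.
    if read != len {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            format!("expected {len} bytes, got {read}"),
        ));
    }
    Ok(())
}
```

This keeps the single allocation and avoids zero-filling the buffer, while still detecting a mismatch between the declared and actual lengths.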

This reverts commit 4b9aa30.

orlp · 2025-10-27T10:19:44Z

Thanks, sorry it took a while to review :)

optimize ipc stream read performance

d398c04

Liyixin95 requested review from MarcoGorelli, alexander-beedie, c-peters, orlp, reswqa and ritchie46 as code owners September 30, 2025 06:47

github-actions bot added the performance, python, and rust labels Sep 30, 2025

fix

8328ff3

orlp requested changes Oct 3, 2025

liyixin added 2 commits October 4, 2025 09:39

make reborrow explicit

2bca949

rewrite read_to_end to read_excat

4b9aa30

ritchie46 force-pushed the main branch 3 times, most recently from 90ceb7b to e9fce55 Compare October 26, 2025 16:01

liyixin and others added 4 commits October 27, 2025 16:02

Revert "rewrite read_to_end to read_excat"

bf0b689

This reverts commit 4b9aa30.

use assert to check length

0c9522d

Merge branch 'main' into optimize_ipc

1256658

merge upstream change

a68be9a

orlp approved these changes Oct 27, 2025

orlp merged commit 48840c6 into pola-rs:main Oct 27, 2025
23 checks passed
